New dynamic construction techniques for M-tree

Authors

  • Tomás Skopal
  • Jakub Lokoc
Abstract

Since its introduction in 1997, the M-tree has become a respected metric access method (MAM), while remaining, together with its descendants, the only database-friendly MAM, that is, a dynamic structure persisted in a paged index. Although many other MAMs have been developed over the last decade, most of them require either static or expensive indexing. By contrast, dynamic M-tree construction allows us to index very large databases in subquadratic time while keeping the index up to date (i.e., it supports arbitrary insertions and deletions). In this article we propose two new techniques that improve dynamic insertions in the M-tree: forced reinsertion strategies and so-called hybrid-way leaf selection. Both techniques preserve the logarithmic asymptotic complexity of a single insertion while aiming to produce more compact M-tree hierarchies (which leads to faster query processing). In particular, the former technique reuses the well-known principle of forced reinsertions: the new insertion algorithm tries to re-insert the content of an M-tree leaf that is about to split in order to avoid that split. The latter technique constitutes an efficiency-scalable selection of a suitable leaf node into which a new object is to be inserted. In the experiments we show that the proposed techniques bring a clear improvement (speeding up both indexing and query processing) and also provide a tuning tool for the indexing vs. querying efficiency trade-off. Moreover, a combination of the new techniques exhibits a synergistic effect, resulting in the best strategy for dynamic M-tree construction proposed so far.
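The forced-reinsertion idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it models only a single flat level of leaves rather than a paged M-tree hierarchy, and the names `LeafLevel`, `Leaf`, `euclid`, the nearest-pivot leaf choice, the "move the farthest object" rule, and the balanced split are all assumptions made for illustration. It shows only the general principle that an overflowing leaf first tries to hand an object to another leaf with free room, and splits only if that fails.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, ...]


def euclid(a: Point, b: Point) -> float:
    """Euclidean distance; any metric could be substituted."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


@dataclass
class Leaf:
    pivot: Point                      # routing object of the leaf
    objects: List[Point] = field(default_factory=list)


class LeafLevel:
    """Hypothetical stand-in for the leaf level of an M-tree (no inner nodes)."""

    def __init__(self, capacity: int, dist=euclid, reinsert: bool = True):
        self.capacity = capacity
        self.dist = dist
        self.reinsert = reinsert      # toggle forced reinsertion on/off
        self.leaves: List[Leaf] = []
        self.splits = 0               # number of splits performed so far

    def _nearest(self, obj: Point, candidates: List[Leaf]):
        # Assumed selection rule: the leaf whose pivot is closest to obj.
        return min(candidates, key=lambda leaf: self.dist(leaf.pivot, obj),
                   default=None)

    def insert(self, obj: Point) -> None:
        if not self.leaves:
            self.leaves.append(Leaf(pivot=obj, objects=[obj]))
            return
        leaf = self._nearest(obj, self.leaves)
        leaf.objects.append(obj)
        # On overflow, try forced reinsertion first; split only if it fails.
        if len(leaf.objects) > self.capacity and not self._try_reinsert(leaf):
            self._split(leaf)

    def _try_reinsert(self, leaf: Leaf) -> bool:
        if not self.reinsert:
            return False
        # Assumed victim rule: the object farthest from the leaf's pivot.
        victim = max(leaf.objects, key=lambda o: self.dist(leaf.pivot, o))
        others = [l for l in self.leaves
                  if l is not leaf and len(l.objects) < self.capacity]
        target = self._nearest(victim, others)
        if target is None:
            return False              # nowhere to move it: a split is needed
        leaf.objects.remove(victim)
        target.objects.append(victim)
        return True

    def _split(self, leaf: Leaf) -> None:
        # Assumed balanced split: the nearer half stays, the farther half
        # forms a new leaf whose pivot is the farthest object.
        ordered = sorted(leaf.objects, key=lambda o: self.dist(leaf.pivot, o))
        half = len(ordered) // 2
        leaf.objects = ordered[:half]
        self.leaves.append(Leaf(pivot=ordered[-1], objects=ordered[half:]))
        self.splits += 1
```

Constructing two instances with `reinsert=True` and `reinsert=False` and feeding both the same objects lets one compare `splits` and `len(leaves)`. With a capacity of 3 and the six collinear points used in the test, the reinserting variant performs one split and the plain variant performs two. This toy outcome says nothing about the measurements reported in the article.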


Similar resources

A Dynamic Method for Answering Ad Hoc Continuous Aggregate Queries

Data streams are infinite, fast sequences of time-stamped data elements that are received explosively. Generally, these elements need to be processed online and in real time, so algorithms that process data streams and answer queries on them are mostly one-pass. The execution of such algorithms faces challenges such as memory limitation, scheduling, and accuracy of answers. They will be more ...


A Multi-Criteria Decision-Making Approach with Interval Numbers for Evaluating Project Risk Responses

The risk response development is one of the main phases in the project risk management that has major impacts on a large-scale project’s success. Since projects are unique, and risks are dynamic through the life of the projects, it is necessary to formulate responses of the important risks. Conventional approaches tend to be less effective in dealing with the imprecise of the risk response deve...


The effect of Yazd-Eghlid railway construction on diversity and richness of shrub and bush-tree rangelands in Yazd province

Railway construction is one of the important activities in the development of any country, and in developing countries the need for roads is one of the main axes of development. Railway construction operations can affect the desert rangelands around the railway. This study investigates the effects of Yazd-Eghlid railway construction on vegetation diversity and richness in the rangelands of Kalmand-...


Improved Decision tree algorithm for data streams with Concept-drift adaptation

Decision tree construction is a well studied problem in data mining. Recently, there has been much interest in mining streaming data. Algorithms like VFDT and CVFDT exist for the construction of a decision tree but, as the new examples are added, a new model has to be generated. In this paper, we have given an algorithm for construction of a decision tree that uses discriminant analysis, to cho...


Revisiting M-Tree Building Principles

The M-tree is a dynamic data structure designed to index metric datasets. In this paper we introduce two dynamic techniques of building the M-tree. The first one incorporates a multi-way object insertion while the second one exploits the generalized slim-down algorithm. Usage of these techniques or even combination of them significantly increases the querying performance of the M-tree. We also ...



Journal:
  • J. Discrete Algorithms

Volume 7  Issue 

Pages  -

Publication date: 2009